This is the replication package for "The German Trade Shock and the Rise of the
Neo-Welfare State in Early 20th Century Britain"

Kenneth Scheve (kenneth.scheve@yale.edu) and Theo Serlin (tserlin@stanford.edu)


-------------------------------------------------------------------------------
Data


Folders

The folder "Data/_Final datasets" provides the final analysis datasets to
reproduce all results in the article.

The folder "Data/_Data for figures" provides additional data needed to
reproduce figures in the article.

The other folders in the folder "Data" provide source data to reproduce the
analysis and figures datasets.


Limitations

The data included in "Data/_Final datasets" is sufficient to reproduce all
numerical results in the paper. We do not include some of the source data
in cases where the licenses prohibit redistribution or where the data
were collected by other researchers:

We do not include the IPUMS census microdata as the license prohibits
redistributing the data. It can be downloaded from
https://international.ipums.org/international.

We do not include shapefiles for parliamentary constituencies from the Great
Britain Historical GIS for the same reason. They can be downloaded from
www.VisionofBritain.org.uk.

We do not provide the data on campaign manifestos collected by Laura Bronner
and Daniel Ziblatt, as that data is not publicly available.



-------------------------------------------------------------------------------
Code


Main analysis

To replicate all tables and figures, run the R file "replication_analysis.R".
This file saves LaTeX tables to the folder Tables and figures to the folder
figures.

Before running the code, you will need to download shapefiles for parliamentary
constituencies in England and Wales for 1885-1918 from the Great Britain
Historical GIS, www.VisionofBritain.org.uk. These are only needed to produce
Figure 4. Data for all other analyses is provided. You will also need to
install the R packages listed at the top of "replication_analysis.R",
change the working directory on line 18, and change the references to the
shapefile locations.
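
The setup steps above can be checked before running the main script with a
short sketch like the following. It is not part of the replication package:
the package names and both paths are placeholders, and the helper
missing_packages() is hypothetical. The actual package list is the one at the
top of "replication_analysis.R".

```r
# Hypothetical pre-flight check; all names and paths below are placeholders.
needed_pkgs <- c("stats", "utils")  # replace with the packages listed in the script

# Return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing <- missing_packages(needed_pkgs)
if (length(missing) > 0) {
  message("Install before running: ", paste(missing, collapse = ", "))
}

# Placeholder paths: point these at your copy of the package and the shapefiles.
project_dir   <- "path/to/replication_package"
shapefile_dir <- "path/to/constituency_shapefiles"

for (p in c(project_dir, shapefile_dir)) {
  if (!dir.exists(p)) message("Directory not found: ", p)
}
```

Any package reported as missing can be installed with install.packages()
before the main script is run.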

After running "replication_analysis.R", use the LaTeX file
"replicated_tables_charts.tex" to create a pdf of these tables and charts.

"replication_analysis.R" also populates the folder "Tables with full output",
which gives the full regression output for regression tables in which we
suppress coefficients of control variables in the main text. Use the LaTeX
file "Full_Tables_Appendix.tex" to create a pdf of these supplemental
tables.


Supplementary analyses and data-cleaning

For the analysis of terms distinguishing Beveridge's analysis of unemployment
from earlier concepts, run the R file "beveridge_terms.R".

To re-create the analysis datasets from the source data, run the R file
"replication_prepping_data.R". Before running that code, you will need to
download the 1881, 1891, 1901, and 1911 census microdata for England and
Wales from https://international.ipums.org/international/. We provide the
codebooks from the versions we used in the folders Data/Census, which
specify the variables needed. We do not provide the data on campaign
manifestos collected by Bronner and Ziblatt, but the file
"replication_prepping_data.R" does provide commented-out code on the
transformations we applied to that dataset.



-------------------------------------------------------------------------------
Outputs

The file "Extended_Appendix.pdf" contains the tables and figures from the
paper's online appendix, as well as additional tables and figures that were
excluded from the online appendix due to space constraints.

The file "Full_Tables_Appendix.pdf" contains tables reporting full regression
output for tables in the paper and appendix.



------------------------------------------------------------------------------
Guide to Variables

The Excel spreadsheet "codebook.xlsx" contains:
(1) information on the final datasets used in the analysis;
(2) details and sources for all the variables in the final analysis datasets;
(3) full citations for the sources of the data.


------------------------------------------------------------------------------
Software

We ran all code using the following R installation:
               _                           
platform       x86_64-apple-darwin17.0     
arch           x86_64                      
os             darwin17.0                  
system         x86_64, darwin17.0          
status                                     
major          4                           
minor          0.2                         
year           2020                        
month          06                          
day            22                          
svn rev        78730                       
language       R                           
version.string R version 4.0.2 (2020-06-22)
nickname       Taking Off Again            
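
To compare a local installation against the version reported above, a minimal
sketch such as the following can be used. It is an illustration only and is
not part of the replication files; the variable names are our own.

```r
# Version reported in this README; the comparison is informational only.
readme_version <- "4.0.2"

running <- getRversion()
if (running != readme_version) {
  message("Running R ", as.character(running),
          "; this README reports R ", readme_version, ".")
}
```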

The code in "replication_analysis.R" for bootstrapping the CH estimator uses
mclapply() from the parallel package, which relies on process forking and does
not work with multiple cores on Windows systems. To run this code on a Windows
computer, change the mclapply() references to lapply() and remove the
arguments specifying the number of cores and multicore seeds.
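
As an illustration of that substitution, the sketch below wraps the two calls
in one helper that falls back to lapply() on Windows. The helper
portable_lapply() and the toy resampling example are hypothetical and do not
reproduce the bootstrap code in "replication_analysis.R".

```r
library(parallel)

# Hypothetical helper: use mclapply() where forking is available,
# and plain lapply() on Windows.
portable_lapply <- function(X, FUN, ..., cores = 2L) {
  if (.Platform$OS.type == "windows") {
    lapply(X, FUN, ...)
  } else {
    mclapply(X, FUN, ..., mc.cores = cores)
  }
}

# Toy stand-in for a bootstrap: means of 20 resamples of simulated data.
set.seed(1)
x <- rnorm(100)
boot_means <- portable_lapply(
  seq_len(20),
  function(i) mean(sample(x, replace = TRUE))
)
```

The sequential fallback returns a list of the same shape as the parallel call
and only takes longer to run.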